Machine learning-based model to predict delirium in patients with advanced cancer treated with palliative care: a multicenter, patient-based registry cohort

This study aimed to present a new approach to predicting delirium in patients admitted to the acute palliative care unit. To achieve this, this study employed machine learning models to predict delirium in patients in palliative care and identified the significant features that influenced the models. We conducted a multicenter, patient-based registry cohort study in South Korea between January 1, 2019, and December 31, 2020. Delirium was identified by reviewing the medical records based on the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition. The study dataset included 165 patients with delirium among 2314 patients with advanced cancer admitted to the acute palliative care unit. Seven machine learning models (extreme gradient boosting, adaptive boosting, gradient boosting, light gradient boosting, logistic regression, support vector machine, and random forest) were evaluated to predict delirium in patients with advanced cancer admitted to the acute palliative care unit. An ensemble approach was adopted to determine the optimal model. In k-fold cross-validation, the combination of extreme gradient boosting and random forest provided the best performance, achieving the following metrics: 68.83% sensitivity, 70.85% specificity, 69.84% balanced accuracy, and 74.55% area under the receiver operating characteristic curve. Performance was also validated on an isolated testing dataset, and the machine learning model was deployed on a public website (http://ai-wm.khu.ac.kr/Delirium/) to provide public access to delirium prediction results in patients with advanced cancer. Furthermore, using feature importance analysis, sex was determined to be the top contributor in predicting delirium, followed by a history of delirium, chemotherapy, smoking status, alcohol consumption, and living with family.
Based on a large-scale, multicenter, patient-based registry cohort, a machine learning prediction model for delirium in patients with advanced cancer was developed in South Korea. We believe that this model will assist healthcare providers in treating patients with delirium and advanced cancer.


Data source and study population
Our study utilized a multicenter, patient-based registry cohort collected from four hospitals in South Korea: Seoul National University Bundang Hospital, Yonsei University Severance Hospital, CHA University Bundang Medical Center, and Seoul National University Hospital. We identified potential participants as patients with advanced cancer admitted to the acute palliative care unit (APCU) at the four centers between January 1, 2019, and December 31, 2020. A total of 2328 patients met the eligibility criteria: (1) aged 20 years or older; (2) diagnosed with advanced solid cancer; and (3) admitted to the APCU. We excluded five patients with a hospital stay exceeding 3 months, six patients transferred to other departments, and three patients with terminal delirium, defined as delirium that occurred within 2 weeks of death. Our final sample consisted of 2314 patients with advanced cancer who were admitted to the APCU and who met all eligibility criteria 15.
The study protocol received approval from the Institutional Review Boards of each center (CHA University, CHAMC 2021-03-054-002; Seoul National University, H-2103-028-1201; Seoul National University Bundang Hospital, B-2104/681-405; and Yonsei University, 4-2021-0323). The requirement for informed consent was waived by the Institutional Review Board of each center (CHA University; Seoul National University; Seoul National University Bundang Hospital; and Yonsei University) because only anonymized data were examined. The researchers of this study confirm that all methods were performed in accordance with the relevant guidelines and regulations. In particular, this research followed the guidelines outlined in the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement (Table S1).

Variables for machine learning
A total of 39 variables were used in this study; they were selected based on several previous studies predicting delirium and on the variables available in the APCU [16][17][18]. Based on these results, we proceeded with the establishment of a national registry, excluding data whose construction was deemed infeasible. Additionally 19 , within the National Registry Project. The dataset included general information 20,21 such as age, sex, chemotherapy during hospitalization, living situation, medical aid recipients, education level, use of glasses or hearing aids, and history of alcohol consumption and smoking. Clinical risk factors such as obesity, blood pressure, and body temperature; various laboratory results such as blood tests and C-reactive protein levels; and a history of diseases including delirium, cardiovascular disease, diabetes mellitus, respiratory disease, liver disease, mental illness, and head injury were also collected. We aimed to predict the onset of delirium in patients with advanced cancer immediately upon APCU admission; hence, all baseline data were obtained at the time of admission to the APCU.
To identify delirium, we reviewed medical records based on the criteria outlined in the Fifth Edition of the Diagnostic and Statistical Manual of Mental Disorders. A well-trained physician and an academic nurse conducted this detailed review. Based on a previous validation study, we did not use the codes from the 10th revision of the International Classification of Diseases because they were deemed unreliable, with low sensitivity 22. Instead, we recorded all potential symptoms, signs, and associated medications and had at least two specialists (BDK and YJK) review each case. In case of any disagreement between the specialists, an additional specialist (SHY) was consulted to make the final decision.
The primary objective of this study was to predict the occurrence of delirium in patients with advanced cancer admitted to the APCU using machine learning models. To achieve this, the data were split into training and testing sets at a ratio of 80:20, with the training set comprising 1851 (80%) patients and the testing set comprising 463 (20%) patients. Feature normalization was performed by first computing the mean and standard deviation of each feature within the training set. These training-set statistics were then applied to both the training and testing datasets, so that each feature in the training set had a mean of zero and a standard deviation of one. The proposed machine learning models were validated through a stratified fivefold cross-validation process on the training data, followed by further validation using independent testing data [23][24][25][26][27].
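The split and normalization steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the helper names (stratified_split, fit_scaler, transform), the two synthetic features, and the choice to stratify the 80:20 split by outcome are assumptions made for the example. The source states only that the cross-validation was stratified.

```python
import random
import statistics


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_scaler(rows):
    """Per-feature mean and standard deviation, computed on training rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population SD; fall back to 1.0 so constant features do not divide by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply z-score scaling using previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Synthetic stand-in for the cohort: 165 delirium cases among 2314 patients.
rng = random.Random(42)
labels = [1] * 165 + [0] * 2149
# Two hypothetical features (age, body temperature); values are made up.
features = [[rng.gauss(60, 12), rng.gauss(36.8, 0.5)] for _ in labels]

train_idx, test_idx = stratified_split(labels)
X_train = [features[i] for i in train_idx]
X_test = [features[i] for i in test_idx]

means, stds = fit_scaler(X_train)            # statistics from training data only
X_train_z = transform(X_train, means, stds)
X_test_z = transform(X_test, means, stds)    # testing data reuse training statistics

print(len(train_idx), len(test_idx))  # 1851 463
```

Fitting the scaler on the training rows alone keeps information from the testing set out of model development; the testing features are therefore only approximately centered at zero.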

Machine learning models and evaluation metrics
We evaluated seven machine learning algorithms for predicting delirium in patients with advanced cancer: extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), gradient boosting (GBM), light gradient boosting (LGBM), logistic regression (LR), support vector machine (SVM), and random forest (RF). For each of these seven algorithms, we tuned the input parameters and hyperparameters with an exhaustive search, which evaluates by brute force every combination in a predefined set of hyperparameter values and selects the combination yielding the best performance, using fivefold cross-validation for each model. To estimate the uncertainty and variability of our results, we calculated the AUROC, sensitivity, specificity, accuracy, and balanced accuracy scores during the fivefold cross-validation process. For binary classification, these metrics are calculated from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN): sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = (TP + TN) / (TP + TN + FP + FN), and balanced accuracy = (sensitivity + specificity) / 2. We adopted AUROC as the evaluation metric for the overall performance of the model; it is commonly used in binary classification, is not sensitive to class imbalance, and represents the relationship between the true positive rate (TPR) and the false positive rate (FPR) as the threshold changes.
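As a hedged illustration of the evaluation metrics named above, the following sketch computes them from confusion-matrix counts and computes AUROC through its pairwise-ranking interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case). The function names and the eight-patient toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, and balanced accuracy."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 positive and 5 negative cases with made-up scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
area = auroc(y_true, scores)
print(metrics)
print(area)
```

As a consistency check on the reported numbers, the cross-validation balanced accuracy of 69.84% equals the mean of the reported sensitivity (68.83%) and specificity (70.85%).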
To further enhance the performance of the machine learning model, we employed an ensemble approach. This technique combines multiple models to improve prediction accuracy and robustness. We created various groups of models by combining all possible model combinations and evaluated their performances to determine the best combination. This approach leveraged the strengths of each individual model while mitigating any weaknesses or limitations.
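The combination search can be sketched as below. The text does not state how the models' outputs were combined, so this example assumes soft voting (averaging the predicted probabilities) and ranks each combination by AUROC. The model names, probabilities, and helper functions are hypothetical and do not reflect the study's results.

```python
from itertools import combinations


def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_probabilities(prob_lists):
    """Soft voting: mean predicted probability per patient across models."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


def best_ensemble(model_probs, y_true, min_size=2):
    """Score every combination of at least min_size models and return the best."""
    names = sorted(model_probs)
    results = []
    for k in range(min_size, len(names) + 1):
        for combo in combinations(names, k):
            averaged = average_probabilities([model_probs[n] for n in combo])
            results.append((auroc(y_true, averaged), combo))
    return max(results), len(results)


# Made-up validation labels and per-model predicted probabilities.
y_true = [1, 1, 0, 0, 0, 0]
model_probs = {
    "model_a": [0.8, 0.4, 0.5, 0.2, 0.1, 0.3],
    "model_b": [0.6, 0.7, 0.3, 0.4, 0.2, 0.1],
    "model_c": [0.5, 0.1, 0.6, 0.4, 0.3, 0.2],
}

best, n_combos = best_ensemble(model_probs, y_true)
print(best, n_combos)
```

With seven candidate models, an exhaustive search of this kind covers 120 multi-model combinations (all 127 non-empty subsets minus the 7 single models), which is small enough to evaluate directly.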
For each of the best performing machine learning models, we investigated the feature importance, which is a measure of how influential a feature was in splitting a class when branching a node in a tree-based model.
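For an ensemble of tree-based models, per-model importance values can be averaged and rescaled before ranking. The sketch below assumes the rescaling divides by the largest averaged value so that the top feature scores 1.00; the exact normalization used in the study is not specified here, and the feature names and values are placeholders.

```python
def rank_features(importances_by_model):
    """Average importances across models, scale the top feature to 1.0, and sort."""
    models = list(importances_by_model.values())
    features = list(models[0])
    averaged = {f: sum(m[f] for m in models) / len(models) for f in features}
    top = max(averaged.values())
    normalized = {f: value / top for f, value in averaged.items()}
    return sorted(normalized.items(), key=lambda item: item[1], reverse=True)


# Hypothetical importance values from two tree-based models.
importances = {
    "tree_model_1": {"feature_a": 0.5, "feature_b": 0.3, "feature_c": 0.2},
    "tree_model_2": {"feature_a": 0.4, "feature_b": 0.2, "feature_c": 0.4},
}

ranked = rank_features(importances)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Scaling by the maximum preserves the ordering and the ratios between features, so the ranking reflects relative rather than absolute contribution.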

Machine learning-driven public website development
We also deployed our machine learning model on a public website (http://ai-wm.khu.ac.kr/Delirium/), enabling the prediction of delirium from 39 items of patient information. Upon accessing the website, users enter patient information, which is encoded on the website server, allowing for an immediate delirium prediction result. No private information beyond the selected 39 items needs to be entered, and all entered information is promptly deleted once the prediction result is obtained, minimizing the risk of information exposure.

Informed consent
The institutional review board of the four centers approved this study and waived the requirement for informed consent because only anonymized data were examined.

Results
This study utilized a multicenter, patient-based registry cohort collected from four hospitals in South Korea to develop and investigate a machine learning model for predicting delirium in patients with advanced cancer. Table 1 displays the baseline characteristics of the study population. In the original cohort, 165 (7.1%) patients experienced delirium. Table 2 compares the fivefold cross-validation performance of each single model and the ensemble machine learning model in terms of sensitivity, specificity, balanced accuracy, and AUROC. In terms of balanced accuracy and AUROC, three models (RF, XGBoost, and LGBM) demonstrated the highest performance among the single models. To further improve classification performance, we adopted an ensemble approach using these three higher-performing single models. The results revealed that the combination of XGBoost and RF provided the best performance, achieving 68.83% sensitivity, 70.85% specificity, 69.84% balanced accuracy, and 74.55% AUROC. Subsequently, we performed feature importance analysis using the ensemble model that combines XGBoost and RF. We averaged and normalized the feature importance values from the two models and ranked each feature. Figure 1 presents the normalized values of ranked feature importance for all 39 features used to predict delirium in patients with advanced cancer. The results indicated that sex (1.00) had the highest importance value and was the primary contributor to predicting delirium, followed by a history of delirium (0.82), chemotherapy during hospitalization (0.81), smoking status (0.73), alcohol consumption (0.67), living with family (0.49), and age (0.47).
We validated the performance of the machine learning models using an isolated testing dataset. Table 3 summarizes the delirium prediction results on the testing dataset. The combination of XGBoost and RF again provided the best performance, with 75.76% sensitivity, 52.63% specificity, 64.19% balanced accuracy, and 73.11% AUROC. The balanced accuracy and AUROC on the testing data were similar to the fivefold cross-validation results, indicating minimal overfitting or underfitting in the model. Furthermore, we deployed our artificial intelligence (AI) model on a public website (http://ai-wm.khu.ac.kr/Delirium/) to allow public access to the delirium prediction results in patients with advanced cancer. Figure 2 displays the website of the deployed AI model. Figure 2a illustrates the user web interface for entering information, where users input data for the 39 features, such as sex, age, chemotherapy during hospitalization, living with family, medical aid recipients, and education level. Upon entering the information into the web application, users can immediately obtain the delirium prediction results, as shown in Fig. 2b. The prediction results include the probability of delirium.

Key findings
The results suggest that machine learning models can predict delirium in patients with advanced cancer admitted to the APCU with relatively high accuracy. The combination model of XGBoost and RF demonstrated the best performance for predicting delirium in these patients, achieving a balanced accuracy of 69.84% and an AUROC of 74.55%. This performance was validated through both k-fold cross-validation and testing on an isolated dataset. Notably, sex emerged as the most critical feature for predicting delirium in patients with advanced cancer, followed by a history of delirium, chemotherapy during hospitalization, smoking status, alcohol consumption, living with family, and advanced age. To the best of our knowledge, this study represents the first attempt to use a machine learning model to predict delirium in South Korean patients with advanced cancer. These findings underscore the importance of delirium screening in APCU-admitted patients with advanced cancer and contribute to identifying the most significant risk factors for this patient group.

Comparison with previous studies
Our results, particularly in the combination model of XGBoost and RF, corroborate previously reported risk factors associated with delirium. Earlier research indicated that advanced age, a history of delirium, smoking status, alcohol consumption, and sex were associated with delirium in patients with advanced cancer admitted to the APCU [31][32][33][34]. Male sex was identified as a significant risk factor for neuropsychiatric disorders, potentially due to the protective role of estrogen in individuals with potential cognitive impairments 35,36. Males may exhibit more pronounced neuropsychiatric disorders under acute stress, driven by different corticotropin-releasing factor signaling pathways compared with females 37. Consistent with prior studies, our findings highlight old age as a significant risk factor for delirium in patients with advanced cancer [38][39][40], with possible contributing factors being atherosclerosis and malnutrition, which are common in older patients [40][41][42]. The association of cigarette smoking with delirium is attributed to nicotine withdrawal during hospitalization 1. Smokers have been noted to display more severe agitation, characteristic of hyperactive delirium 43. Changes in various neurotransmitter systems, including dopamine, opioids, and cholinergic systems, have been implicated in shared hyperactive delirium 44. The relationship between chemotherapeutic agents and delirium remains controversial and inconsistent, as it has been reported only in single case reports or studies with small populations. Previous studies have suggested that patients who undergo multiple chemotherapy regimens could experience delirium, which may occur in approximately one in 11 adults receiving chemotherapy 45,46. Chemotherapeutic agents may penetrate the blood-brain barrier, potentially serving as a risk factor for delirium 47,48. Similar to our study, a previous study predicted delirium in patients with advanced cancer receiving pharmacological intervention through a visually interpretable prediction model 9. That study has the advantage of being easy to use with a small number of variables, but it depends on the Delirium Rating Scale Revised-98 and is limited to predicting delirium within three days. In contrast, our study provides a publicly accessible web application built on a machine learning model, which could serve as a medical aid for healthcare providers monitoring delirium in patients with advanced cancer.

Strengths and limitations
The primary strength of this study lies in the relatively high accuracy of the machine learning model for detecting delirium in patients with advanced cancer, as validated by testing datasets. Consistently high AUROC values in both the training and testing datasets indicate that the combination model of XGBoost and RF is capable of predicting delirium in patients with advanced cancer. Important predictors of delirium include sex, history of delirium, chemotherapy during hospitalization, smoking status, alcohol consumption, living with family, and advanced age. The dataset was collected from four academic cancer centers, involving oncology-trained physicians and healthcare providers, providing a comprehensive view of risk factors associated with delirium in patients with advanced cancer and potentially aiding in the development of effective preventive interventions.
However, this study had several limitations. Firstly, the datasets were collected from patients admitted to four hospitals and were heterogeneous, potentially limiting the generalizability of the model to the general population. Secondly, delirium assessment tools, diagnostic criteria, observation frequency, and timeframes may differ from those used in clinical trials. Thirdly, machine learning models often benefit from larger datasets, but the sample size of this study was limited. Fourthly, our proposed machine learning model underperformed compared to previous studies predicting delirium across varying patient conditions 49,50. Given the limitations of our registry construction project, we did not collect data at various time points. Additional research may be necessary to address this gap. Fifthly, the dataset of this study lacks information pertaining to delirium-related medications or disease history. However, we have initiated the establishment of a new prospective cohort to supplement the inadequate input data. Consequently, we plan to conduct further research to develop more sophisticated machine learning models in subsequent studies. Sixthly, due to the retrospective design of our registry for patients with advanced cancer, it was not feasible to distinguish between different types of delirium (hyperactive, hypoactive, and mixed). We are fully aware of this limitation and are currently making efforts to differentiate between them in our newly established prospective cohort. To apply machine learning models and achieve external validation, a dataset with a larger sample size is required. Lastly, an imbalance in the number of patients in each group may limit the performance of the models 51,52.

Clinical and policy implications
To the best of our knowledge, this study represents the first creation of a machine learning model for predicting delirium in patients with advanced cancer admitted to the APCU. The use of this machine learning model for delirium prediction in APCU-admitted patients with advanced cancer could improve patient quality of life and reduce physician workload. In particular, for Korean healthcare providers with less educational experience in delirium 53, the machine learning-based delirium prediction model for patients with advanced cancer could serve as a medical aid. Delirium episodes are particularly common in patients with advanced cancer in the APCU, with prevalence increasing as the terminal phase of the illness approaches. However, delirium in these patients has been inadequately identified and managed. Our model has the potential to improve risk assessment, early detection, and effective interventions for delirium in patients with advanced cancer.

Conclusion
Using a large-scale, multicenter, patient-based registry cohort, we have developed a machine learning prediction model for delirium in South Korean patients with advanced cancer. Our study revealed that the combination of XGBoost and RF delivered the best performance, a conclusion validated by the results of both k-fold cross-validation and the isolated testing dataset. Additionally, we identified sex as the primary predictor of delirium, followed by history of delirium, chemotherapy, smoking status, alcohol consumption, and living with family. Furthermore, we have made our AI model accessible to the public through a dedicated website (http://ai-wm.khu.ac.kr/Delirium/) to provide delirium prediction results for patients with advanced cancer. Although external validation using prospectively collected data may be necessary to further refine and validate the model, we have implemented a web application to gather additional data. Notably, the application does not store any user-entered information at present. However, we plan to securely store user-entered information with users' consent, facilitating a real-time learning process to enhance the machine learning model.

Figure 1. Ranked feature importance values for all 39 features. WBC white blood cell count, PLT platelets, AST aspartate transaminase, ALT alanine transaminase, BUN blood urea nitrogen.

Figure 2. Deployed web application predicting delirium: (a) user input, (b) prediction results with delirium probability in patients with advanced cancer.

Table 1. Included variables for an artificial intelligence model and patient information (total n = 2314). DBP diastolic blood pressure, SD standard deviation, SBP systolic blood pressure.

Table 2. Fivefold cross-validation result comparison according to machine learning models. GB gradient boosting, SVM support vector machine, AdaBoost adaptive boosting, XGBoost extreme gradient boosting, RF random forest. The combination of XGBoost and RF provided the best performance, as indicated in bold.

Table 3. Delirium prediction results from the testing dataset. GB gradient boosting, SVM support vector machine, AdaBoost adaptive boosting, XGBoost extreme gradient boosting, RF random forest. The combination of XGBoost and RF provided the best performance, as indicated in bold.